Using Embedded Network Processors to Implement Global Memory Management in a Workstation Cluster
Abstract
Advances in network technology continue to improve the communication performance of workstation and PC clusters, making high-performance workstation-cluster computing increasingly viable. These hardware advances, however, are taxing traditional host-software network protocols to the breaking point. A modern gigabit network can swamp a host's I/O bus and processor, limiting communication performance and slowing computation unacceptably. Fortunately, the host-programmable network processors used by these networks present a potential solution. Offloading selected host processing to these embedded network processors lowers host overhead and improves latency. This paper examines the use of embedded network processors to improve the performance of workstation-cluster global memory management. We have implemented a revised version of the GMS global memory system that reduces host overhead by as much as 29% on active nodes and improves page fault latency by as much as 39%.
Similar resources
Global Memory Management for Workstation Networks
Global Memory Management for Workstation Networks, by Michael Joseph Feeley. Chairperson of the Supervisory Committee: Professor Henry M. Levy, Department of Computer Science and Engineering. Advances in network and processor technology have greatly changed the communication and computational power of local-area workstation networks. However, operating systems still treat workstation networks as a ...
Data Merging for Shared Memory Multiprocessors
Keywords: cache coherence, delayed consistency, shared memory multiprocessors. We describe an efficient software cache consistency mechanism for shared memory multiprocessors that supports multiple writers and works for cache lines of any size. Our mechanism relies on the fact that, for a correct program, only the global memory needs a consistent view of the shared data between synchronization points. Our d...
Acceleration of Optical-Flow Extraction Using Dynamically Reconfigurable ALU Arrays
An effective way to implement image processing applications is to use embedded processors with dynamically reconfigurable accelerator cores. The processing speed of these processors depends not only on the parallelism but also on local memory utilization, since the local memories are much faster than the global memory. In this paper, we accelerate the optical-flow extraction algo...
Cost-Performance Evaluation of SMP Clusters
Clusters of Personal Computers have been proposed as potential replacements for expensive compute servers. One limitation in the overall performance is the interconnection network. A possible solution is to use multiple processors on each node of the PC cluster. Parallel programs can then use the fast shared memory to exchange data within a node, and access the interconnection network to commun...
A Commodity High Performance PC Cluster Project Centered on a Programmable Network Processor
This project aims to construct a non-dedicated, high performance PC cluster using an advanced network processor for solving CPU- and memory-intensive applications. Early cluster projects either use commonplace network technologies such as Ethernet or use specialized, more advanced network technologies such as Myrinet to connect commodity computers to form a single parallel processing system. Rega...
Publication date: 1999